Skip to content

fix(sort): record output_batches, output_bytes and end_time for when not using merge sort#22878

Merged
rluvaton merged 17 commits into
apache:mainfrom
rluvaton:fix-sort-metrics
Jul 2, 2026
Merged

fix(sort): record output_batches, output_bytes and end_time for when not using merge sort#22878
rluvaton merged 17 commits into
apache:mainfrom
rluvaton:fix-sort-metrics

Conversation

@rluvaton

@rluvaton rluvaton commented Jun 10, 2026

Copy link
Copy Markdown
Member

Which issue does this PR close?

N/A

Rationale for this change

When only using in mem sort without going through the StreamingMergeBuilder some metrics are not being recorded

What changes are included in this PR?

wrap in ObservedStream for streams that returned to the next operator

Are these changes tested?

yes

Are there any user-facing changes?

no

@github-actions github-actions Bot added the physical-plan Changes to the physical-plan crate label Jun 10, 2026
@github-actions github-actions Bot added the core Core DataFusion crate label Jun 10, 2026
@rluvaton rluvaton marked this pull request as ready for review June 11, 2026 22:24

@kosiew kosiew left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rluvaton
This looks good to me. The final SortExec output stream now records the baseline output_rows, output_batches, output_bytes, and end_time exactly once, including the fast paths that skip StreamingMergeBuilder. I only have a couple of small cleanup suggestions to make that invariant easier to preserve.

Comment thread datafusion/physical-plan/src/sorts/sort.rs Outdated
Comment thread datafusion/physical-plan/src/sorts/multi_level_merge.rs Outdated
@alamb

alamb commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Shall we merge this PR? It seems to have some conflicts

@kosiew

kosiew commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

@rluvaton
Can you resolve the merge conflicts?

rluvaton added 4 commits July 2, 2026 13:56
# Conflicts:
#	datafusion/physical-plan/src/sorts/multi_level_merge.rs
#	datafusion/physical-plan/src/sorts/sort.rs
@rluvaton

rluvaton commented Jul 2, 2026

Copy link
Copy Markdown
Member Author

I hate GitHub Notifications, I miss so many

fixed comments and conflicts

@github-actions github-actions Bot added the sqllogictest SQL Logic Tests (.slt) label Jul 2, 2026
@rluvaton rluvaton added this pull request to the merge queue Jul 2, 2026
Merged via the queue into apache:main with commit e1c9012 Jul 2, 2026
38 checks passed
@rluvaton rluvaton deleted the fix-sort-metrics branch July 2, 2026 18:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate physical-plan Changes to the physical-plan crate sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants